Abstract

An object recognition approach based on concurrent
coarse-and-fine matching using a multi-layer Hopfield
neural network is presented. The proposed network
consists of several cascaded single layer Hopfield networks,
each encoding object features at a distinct resolution,
with bidirectional interconnections linking adjacent
layers. The interconnection weights between
nodes associating adjacent layers are structured to favor
node pairs for which model translation and rotation,
when viewed at the two corresponding resolutions,
are consistent. This inter-layer feedback feature
of the algorithm reinforces the usual intra-layer matching
process in conventional single layer Hopfield nets
in order to compute the model-object match which is
most consistent across several resolution levels. The
performance of the algorithm is demonstrated in cases
of images containing single and multiple occluded objects.
These results are compared with recognition results
obtained using a single layer Hopfield network.

1  Introduction 

Object recognition has emerged as a subject of wide
research interest during the last decade¹. Two common
themes characterizing much of the recent work
have been the use of a priori information in the form
of models and constraints, and the incorporation of the
most current image processing tools to enhance recognition
performance. In this spirit, the objective of the
present study is to explore the use of multi-resolution
(pyramidal) image representation in the context of recently
reported neural network implementation technology,
with the goal of faster and more robust automated
object recognition performance.

It is natural to seek object recognition cues concurrently
at several resolution levels. Multi-resolution
image representation and processing is a well known
image analysis methodology. A multi-resolution image
representation can be viewed as an image pyramid.
Important classes of image pyramids include the
Gaussian pyramid, Laplacian pyramid, and subband
pyramid [2].

¹ This work was partially supported by Amherst Systems,
Inc., 30 Wilson Road, Buffalo NY 14221-7082, under Office of
Naval Research contract number N00014-U1-O-02.S7.

The most immediate utility of a multi-resolution
pyramid representation is that it can reduce the computational
cost of various image search operations. A
major problem associated with this hierarchical
(coarse-to-fine) strategy is that if a mistake occurs at an
early stage, the low resolution error propagates into each
subsequent higher resolution level and finally a mismatch
occurs. This mismatch cannot be corrected
by using the information at any level because the information
flows top-down in a feed-forward manner
and there is no feedback from higher resolution levels.
To address this problem, a technique called "coarse-and-fine"
matching is proposed in this paper, where
top-down and bottom-up matching are concurrently
performed for each pair of levels of the image pyramid
in order to find the best matched features at each level
pair simultaneously.

The proposed coarse-and-fine strategy is implemented
by utilizing a multi-layer Hopfield neural network.
The single layer Hopfield neural network from
which it derives has been used in a wide range of applications,
such as vision tasks. Vision tasks can be
formulated as an optimization problem where an energy
function is minimized. The search for its global
minimum can be implemented through a Hopfield neural
network with interconnection weights generating
an equivalent energy function. Unfortunately, there
typically exist multiple local minima in such energy
functions due to their non-convexity and the high
dimensionality of their arguments, and a gradient descent
procedure is vulnerable to early termination. The Hopfield
network can get trapped in any of these local minima
depending on the initial states of the network and the
sequence in which the states of the neurons are updated.

In this paper, a concurrent (coarse-and-fine) multi-resolution
model-based object recognition technique is
proposed using a multi-layer Hopfield neural network
to alleviate some of the problems which arise when a
single layer Hopfield network is utilized. The network
is structured as a cascade of several single layer Hopfield
networks with interconnections between adjacent
layers. It uses matching results in each layer to reinforce
the matching process for adjacent layers through
inter-layer interconnection weights. The values of the
weights between nodes on distinct layers depend on
the intrinsic characteristics of the multi-resolution representation,
that is, the relationship among the multi-resolution
features belonging to adjacent levels of the
image pyramid. Each layer of the network implements
a matching process between the scene and the model
features which are extracted at the corresponding resolution
level of the image pyramid. However, each layer
of the proposed network communicates with adjacent
layers, permitting inter-layer feedback during matching.
Thus good matches at multiple levels reinforce
one another, and matches at one level which are not
corroborated at adjacent resolution levels do not propagate
as strongly. Moreover, the equivalent energy
function for the multi-layer Hopfield network is shown
to be smoothed relative to the single layer case, mitigating
the local minima problem, and the examples
are shown to converge to local minima reasonably close
to the global minimum.

This paper is organized as follows. In Section 2,
scene and model pyramids are discussed. In Section 3,
the multi-layer Hopfield neural network is introduced.
In Section 4, the performance of this network is compared
with that of a single layer Hopfield neural network
for recognizing image scenes containing single
objects and multiple occluded objects. Conclusions are
given in Section 5.

2  Scene  and  Model  Pyramid 

In order to implement matching based on a
multi-layered Hopfield network at multiple resolution
levels of images, an image pyramid is first constructed
for each model. A QMF filter [2] is employed in this
paper to build the subband pyramids for each model
and the input scene. The feature primitives that are
utilized in this paper are the high curvature points
(corners). Therefore, a polygon approximation algorithm
[3] is applied to the boundaries of objects to obtain
the corners (vertices) at each level. The numerical features
quantifying a vertex are the angle between the
two polylines that form the vertex and the location of
the vertex. A set of graphs is generated for each 2-D
prototype object, where each graph consists of a set
of corners with their corresponding angle features and
distance features at a particular level of the pyramid.
We call this representation the model graph pyramid.
All the model graph pyramids are then integrated into
a single model database, which is called a global model
graph pyramid. Similarly, a graph pyramid, called a scene
graph pyramid, can be generated for an input scene.
During recognition, the scene graph pyramid
is matched against the global model graph pyramid by
a multi-layer Hopfield neural network to identify and
locate the instances of the models in the test scene for
each level of the pyramid.
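As an illustration of the vertex features described above, the following sketch computes the location and the angle for each vertex of a closed polygon. It is a minimal sketch and not the polygon approximation algorithm of [3]: the function name `vertex_features` and the representation of a boundary as a list of (x, y) tuples with distinct consecutive vertices are assumptions made here for illustration.

```python
import math

def vertex_features(polygon):
    """Return (x, y, angle_in_degrees) for every vertex of a closed polygon.

    `polygon` is a list of (x, y) tuples in boundary order (hypothetical
    input format). The angle is the one between the two edges that meet
    at the vertex, in the range [0, 180].
    """
    n = len(polygon)
    features = []
    for idx, (x, y) in enumerate(polygon):
        px, py = polygon[idx - 1]          # previous vertex (wraps around)
        nx, ny = polygon[(idx + 1) % n]    # next vertex (wraps around)
        ax, ay = px - x, py - y            # edge towards the previous vertex
        bx, by = nx - x, ny - y            # edge towards the next vertex
        dot = ax * bx + ay * by
        norm = math.hypot(ax, ay) * math.hypot(bx, by)
        cos_angle = max(-1.0, min(1.0, dot / norm))  # guard against rounding
        features.append((x, y, math.degrees(math.acos(cos_angle))))
    return features

square = [(0, 0), (2, 0), (2, 2), (0, 2)]
for x, y, angle in vertex_features(square):
    print(f"vertex ({x}, {y}): angle {angle:.1f} degrees")
```

Running the same function on the boundary polygon extracted at each pyramid level would give one feature list per level, which is the kind of per-level graph the text describes.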

3  Multi-layer  Hopfield  Network 

3.1  Construction 

A multi-layer Hopfield network consisting of several
single layer Hopfield networks cascaded together
is shown in Fig. 1. Inputs to each layer are the features
extracted from the corresponding level of the
model and scene pyramids. The nodes within each
layer are fully connected. The adjacent layers of the
multi-layered network are connected by a set of interconnection
weights. For the remainder of this paper,
we will restrict ourselves to a two-layered Hopfield
neural network where the fine features are matched in
the first layer $L_1$ and the coarse features are matched
in the second layer $L_2$, with appropriate interconnection
weights.

3.2  Energy  Function 

To consider the behaviour of a two-layered Hopfield
network, let the state of the network in layer $L_1$ be
represented by a binary state vector $A_1$, the state of the
second layer $L_2$ by the state vector $A_2$, and the state
vector for the entire two-layered network be denoted
by

\[ A = [A_1, A_2] \tag{1} \]

where the entire state vector is the concatenation of
the state vectors of the two layers.

The collective behaviour of the two-layered network can be
characterized by the following overall energy function

\[ E(A) = E_1(A_1) + \alpha_1 E_{12}(A) + E_2(A_2) + \alpha_2 E_{21}(A) \tag{2} \]

where $E_1(A_1)$ is the energy due to the current state of
the layer $L_1$, $E_2(A_2)$ is the energy due to the current
state of the layer $L_2$, $E_{12}(A)$ and $E_{21}(A)$ are the
inter-energies between the state of the layer $L_1$ and the state
of the layer $L_2$ and vice versa, $\alpha_1$ is a parameter that
weights the inter-energy $E_{12}(A)$ relative to the energy
$E_1(A_1)$, and $\alpha_2$ is a parameter that weights the
inter-energy $E_{21}(A)$ relative to the energy $E_2(A_2)$. $\alpha_1$ and
$\alpha_2$ can also be considered as Lagrange multipliers.

The behaviour of the network at layer $L_1$ can be
represented by an energy function given by

\[ \Phi_1(A) = E_1(A_1) + \alpha_1 E_{12}(A). \tag{3} \]

Similarly, the energy function for layer $L_2$ can be written
as

\[ \Phi_2(A) = E_2(A_2) + \alpha_2 E_{21}(A). \tag{4} \]

Our multi-resolution matching process is defined as
minimizing the overall energy function $E(A)$ over the
entire domain of the state vector $A$ in order to obtain the
best matches for each layer. To minimize the overall
energy function $E(A)$, a node in layer $L_1$ is randomly
picked and its state is updated, then a node in layer $L_2$
is randomly chosen and its state is updated. This process
is repeated, each time choosing a node from
layer $L_1$ followed by a node from layer $L_2$.
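The alternating update schedule described above can be sketched as follows. This is only a schematic outline under stated assumptions: `update_node` is a hypothetical callback standing in for the state update rule of the network, and a fixed iteration count stands in for a real termination test.

```python
import random

def alternate_updates(n_nodes_l1, n_nodes_l2, update_node, n_iterations, seed=0):
    """Visit one random node of layer 1, then one random node of layer 2, repeatedly.

    `update_node(layer, index)` is a hypothetical callback that applies the
    state update rule to the chosen node.
    Returns the visiting order as a list of (layer, index) pairs.
    """
    rng = random.Random(seed)
    order = []
    for _ in range(n_iterations):
        for layer, n_nodes in ((1, n_nodes_l1), (2, n_nodes_l2)):
            index = rng.randrange(n_nodes)   # random node within this layer
            update_node(layer, index)
            order.append((layer, index))
    return order

# A do-nothing callback is enough to inspect the visiting order.
visited = alternate_updates(6, 4, lambda layer, index: None, n_iterations=3)
print(visited)
```

Different seeds give different realizations of the random updating sequence, which matters because the final state of a Hopfield network depends on that sequence.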


3.3  Connection  Weights  in  Each  Layer 

The matching process in each layer can be formulated
as minimizing an energy function [1]. For example,
considering the matching process at layer $L_1$, and
assuming no interaction with layer $L_2$, an energy function
that includes all the constraints of the matching
process can be written as:

\[ E_1 = -\frac{1}{2} \sum_{i_1=1}^{M_1} \sum_{k_1=1}^{N_1} \sum_{j_1=1}^{M_1} \sum_{l_1=1}^{N_1} C_{i_1k_1j_1l_1} V_{i_1k_1} V_{j_1l_1} + \sum_{i_1=1}^{M_1} \Bigl( 1 - \sum_{k_1=1}^{N_1} V_{i_1k_1} \Bigr)^2 + \sum_{k_1=1}^{N_1} \Bigl( 1 - \sum_{i_1=1}^{M_1} V_{i_1k_1} \Bigr)^2 \tag{5} \]

where $V_{i_1k_1}$ represents the degree of match between
two features $i_1$ and $k_1$. It takes a value of 1 when
the $i_1$th feature point in the input image matches the
$k_1$th feature point in the model; otherwise, it takes a
value of 0. $N_1$ is the total number of features in the
model graph, which is the sum of all the features in
the models. $M_1$ is the corresponding number of feature
points for the input image in layer $L_1$. The first
term in the energy function represents the compatibility
constraint. The second and the third terms represent
the uniqueness constraint, i.e., for each feature
point there can only be one match. The compatibility
measure $C_{i_1k_1j_1l_1}$ can be expressed as follows:

\[ C_{i_1k_1j_1l_1} = \begin{cases} W_1 F(A_{i_1}, A_{k_1}) + W_2 F(A_{j_1}, A_{l_1}) + W_3 F(d_{i_1j_1}, d_{k_1l_1}), & \text{for } k_1 \text{ and } l_1 \in q \\ -1, & \text{for } k_1 \text{ and } l_1 \notin q \end{cases} \tag{6} \]

where $W_i$ are the weighting factors, which sum to
1 (i.e., $\sum_{i=1}^{3} W_i = 1$), and $q$ denotes the $q$th model. The function
$F(x,y)$ is a discrete non-linear compatibility function
which was defined in [1], such that if $x$ and $y$ are compatible
then $F(x,y)$ has a value 1; otherwise $-1$.
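To make the two kinds of terms in Eq. (5) concrete, the sketch below evaluates the compatibility term and the two uniqueness penalties for a small match matrix. The function name `matching_energy`, the nested-list layouts `V[i][k]` and `C[i][k][j][l]`, and the toy compatibility values are assumptions made for illustration. The energy is assumed to have the form of a factor of -1/2 on the compatibility sum plus the squared row and column constraints.

```python
def matching_energy(V, C):
    """Single-layer matching energy for a binary match matrix V (M x N).

    V[i][k] is 1 if scene feature i is matched to model feature k.
    C[i][k][j][l] is the compatibility of match (i, k) with match (j, l).
    """
    M, N = len(V), len(V[0])
    compat = 0.0
    for i in range(M):
        for k in range(N):
            for j in range(M):
                for l in range(N):
                    compat += C[i][k][j][l] * V[i][k] * V[j][l]
    # Each scene feature should match exactly one model feature ...
    row_penalty = sum((1 - sum(V[i])) ** 2 for i in range(M))
    # ... and each model feature exactly one scene feature.
    col_penalty = sum((1 - sum(V[i][k] for i in range(M))) ** 2 for k in range(N))
    return -0.5 * compat + row_penalty + col_penalty

# Toy compatibility (an assumption): matches on the diagonal (i == k) support
# each other with +1, every other pair of matches is incompatible (-1).
M = N = 2
C = [[[[1.0 if (i == k and j == l) else -1.0 for l in range(N)]
       for j in range(M)] for k in range(N)] for i in range(M)]

one_to_one = [[1, 0], [0, 1]]   # a valid one-to-one matching
all_on = [[1, 1], [1, 1]]       # violates both uniqueness constraints
print(matching_energy(one_to_one, C))  # -2.0
print(matching_energy(all_on, C))      # 8.0
```

In this toy case the consistent one-to-one matching has the lower energy, while the state with every neuron on is penalized both by the incompatible pairs and by the uniqueness terms.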

It has been shown in [1] that the energy function $E_1$
in (5) above is equivalent, up to an additive constant, to
the Hopfield-style energy function [4] given by

\[ E_1 = -\frac{1}{2} \sum_{i_1=1}^{M_1} \sum_{k_1=1}^{N_1} \sum_{j_1=1}^{M_1} \sum_{l_1=1}^{N_1} T_{i_1k_1j_1l_1} V_{i_1k_1} V_{j_1l_1} - \sum_{i_1=1}^{M_1} \sum_{k_1=1}^{N_1} I_{i_1k_1} V_{i_1k_1} \tag{7} \]

where

\[ T_{i_1k_1j_1l_1} = C_{i_1k_1j_1l_1} - 2\delta_{i_1j_1} - 2\delta_{k_1l_1}, \tag{8} \]

\[ I_{i_1k_1} = 4, \tag{9} \]

and $\delta_{i_1j_1} = 1$ if $i_1 = j_1$ and 0 otherwise, and similarly
$\delta_{k_1l_1} = 1$ if $k_1 = l_1$ and 0 otherwise. $T_{i_1k_1j_1l_1}$ represents
the connection weight between a node at $(i_1, k_1)$
and a node at $(j_1, l_1)$ within layer $L_1$.

3.4  Interconnection  Weights  between  the  Layers

At each layer, the matched features for each model
can be used to find the mapping (translation and rotation
parameters) between the model and the corresponding
object in the input scene. The interconnections
between adjacent layers of the network are based
on the relationship between the mapping parameters
of the models obtained at each layer. The relationship
can be summarized as follows: consistent translation
parameters of a model at layer $L_1$ are twice the
translation parameters of the same model at layer $L_2$,
and consistent rotation parameters at layers $L_1$ and $L_2$
must be the same. Therefore, the relationship between
the mappings is

\[ \begin{cases} tx_{L_1} = 2 \times tx_{L_2} \\ ty_{L_1} = 2 \times ty_{L_2} \\ \theta_{L_1} = \theta_{L_2} \end{cases} \tag{10} \]

where $(tx_{L_1}, ty_{L_1})$ and $(tx_{L_2}, ty_{L_2})$ are the translation
parameters of models at layers $L_1$ and $L_2$, respectively,
and $\theta_{L_1}$ and $\theta_{L_2}$ are the rotation parameters of models
at layers $L_1$ and $L_2$, respectively. The values of
$tx_{L_i}$, $ty_{L_i}$ and $\theta_{L_i}$ are obtained as the average of the
translation and rotation of each matched pair of nodes,
as in [5].

Using this consistency constraint between the two
layers, we can define the interconnection weights between
the two layers of the network. The interconnection
weight $R_{i_1k_1i_2k_2}$, which is the connection between
a node $(i_1, k_1)$ in layer $L_1$ and a node $(i_2, k_2)$ in layer
$L_2$, is defined as

\[ R_{i_1k_1i_2k_2} = \begin{cases} 1, & \text{if } k_1, k_2 \in \text{model } q,\ |tx_{L_1} - 2\,tx_{L_2}| < \epsilon_1,\ |ty_{L_1} - 2\,ty_{L_2}| < \epsilon_1,\ \text{and } |\theta_{L_1} - \theta_{L_2}| < \epsilon_2 \\ -1, & \text{otherwise} \end{cases} \tag{11} \]

where $\epsilon_1$ and $\epsilon_2$ are pre-specified thresholds. The
interconnection weights between layers $L_1$ and $L_2$ are
symmetrical. Fig. 2 shows the interconnection weights
between the two layers for two nodes belonging to the
same model or to different models.

The inter-energies $E_{12}$ and $E_{21}$ are equal. They are
denoted by $E_c$, which stands for the coupling energy
between the two layers:

\[ E_c = -\frac{1}{2} \sum_{i_1=1}^{M_1} \sum_{k_1=1}^{N_1} \sum_{i_2=1}^{M_2} \sum_{k_2=1}^{N_2} R_{i_1k_1i_2k_2} V_{i_1k_1} V_{i_2k_2}. \tag{12} \]

Since the inter-energies $E_{12}$ and $E_{21}$ are equal, we will
use $\alpha$ for both $\alpha_1$ and $\alpha_2$ in the remainder of this
paper. Hence, the overall energy for the network is

\[ E = E_1 + E_2 + 2\alpha E_c. \tag{13} \]

It should be pointed out that the values of the interconnection
weights change as the network is updated:
as the states of the nodes change and more correct
matches are obtained, the calculated values of the
translation and rotation also change.
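The consistency test behind the inter-layer weights can be sketched as follows. The function names, the tuple layout `(tx, ty, theta)` for a pose, the threshold values, and the simplification of using a single pose per layer are assumptions made for illustration. The sketch assumes that translations are compared after doubling the coarse-layer values and that rotations are compared directly, with a weight of +1 for a consistent pair from the same model and -1 otherwise.

```python
def interlayer_weight(model_fine, model_coarse, pose_fine, pose_coarse,
                      eps_translation=2.0, eps_rotation=5.0):
    """Weight between a fine-layer node and a coarse-layer node.

    model_fine / model_coarse: labels of the models the two nodes refer to.
    pose_fine / pose_coarse: (tx, ty, theta) estimated at each layer
    (hypothetical layout; theta in degrees, angle wrap-around is ignored).
    """
    if model_fine != model_coarse:
        return -1
    tx1, ty1, th1 = pose_fine
    tx2, ty2, th2 = pose_coarse
    consistent = (abs(tx1 - 2 * tx2) < eps_translation
                  and abs(ty1 - 2 * ty2) < eps_translation
                  and abs(th1 - th2) < eps_rotation)
    return 1 if consistent else -1

def coupling_energy(active_fine, active_coarse, weight):
    """-1/2 times the sum of the weights over all pairs of active nodes."""
    return -0.5 * sum(weight(a, b) for a in active_fine for b in active_coarse)

pose_fine, pose_coarse = (40.0, 20.0, 30.0), (20.5, 10.0, 31.0)
print(interlayer_weight("key_a", "key_a", pose_fine, pose_coarse))  # consistent poses
print(interlayer_weight("key_a", "key_b", pose_fine, pose_coarse))  # different models

# Active nodes are represented here only by their model labels.
weight = lambda a, b: interlayer_weight(a, b, pose_fine, pose_coarse)
print(coupling_energy(["key_a", "key_a"], ["key_a"], weight))
```

Because the poses are re-estimated from the currently matched nodes, a full implementation would recompute these weights after every state change rather than once.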

3.5  Rate  of  Change  in  Energy 

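Consider changing the state $V_{i_1k_1}$ of a neuron
$(i_1, k_1)$ in layer $L_1$, i.e., $V^{n}_{i_1k_1}$ is changed to
$V^{n+1}_{i_1k_1}$. The change in the total energy is

\[ \Delta E = \Delta E_1 + \Delta E_2 + 2\alpha \Delta E_c. \tag{14} \]

Here, $\Delta E_2 = 0$, since the states of the neurons in layer
$L_2$ are not changed. The change of the energy $E_1$ is
given by [1]

\[ \Delta E_1 = -\left( U_{i_1k_1} - 2\Delta V_{i_1k_1} \right) \Delta V_{i_1k_1} \tag{15} \]

where

\[ U_{i_1k_1} = \sum_{j_1=1}^{M_1} \sum_{l_1=1}^{N_1} T_{i_1k_1j_1l_1} V_{j_1l_1} + I_{i_1k_1} \tag{16} \]

and

\[ \Delta V_{i_1k_1} = V^{n+1}_{i_1k_1} - V^{n}_{i_1k_1}. \tag{17} \]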

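The change in coupling energy $E_c$ can be written as

\[ \Delta E_c = -\frac{1}{2} Z_{i_1k_1i_2k_2} + \frac{1}{2} W_{i_2k_2} \Delta V_{i_1k_1} \tag{18} \]

where $Z_{i_1k_1i_2k_2}$ (19) and $W_{i_2k_2}$ (20) are sums of the
layer-$L_2$ states $V_{i_2k_2}$ over $i_2 = 1, \ldots, M_2$ and over the
$k_2$ selected by membership in model $q$ [the expressions
of Eqs. (19) and (20) are illegible in the source],

\[ N_1 = \sum_{q=1}^{Q} N_1^{q}, \qquad N_2 = \sum_{q=1}^{Q} N_2^{q}, \tag{21} \]

$Q$ is the total number of models, and

\[ R^{n}_{i_1k_1i_2k_2},\ R^{n+1}_{i_1k_1i_2k_2} = \begin{cases} 1, & \text{if } k_1, k_2 \in \text{the same model},\ |tx_{L_1} - 2\,tx_{L_2}| < \epsilon_1,\ |ty_{L_1} - 2\,ty_{L_2}| < \epsilon_1,\ \text{and } |\theta_{L_1} - \theta_{L_2}| < \epsilon_2 \\ -1, & \text{otherwise.} \end{cases} \tag{22} \]

$R^{n}_{i_1k_1i_2k_2}$ and $R^{n+1}_{i_1k_1i_2k_2}$ represent the interconnections
when the state of the neuron $(i_1, k_1)$ is $V^{n}_{i_1k_1}$ or $V^{n+1}_{i_1k_1}$,
respectively. Since a change of a binary state has
$(\Delta V_{i_1k_1})^2 = 1$, substituting (15) and (18) into (14)
gives the change of total energy

\[ \Delta E = -\left[ U_{i_1k_1} - \alpha W_{i_2k_2} \right] \Delta V_{i_1k_1} - \alpha Z_{i_1k_1i_2k_2} + 2. \tag{23} \]

Similarly, if the state $V_{i_2k_2}$ of a neuron $(i_2, k_2)$ in
layer $L_2$ is changed, the change in the total energy is

\[ \Delta E = -\left( U_{i_2k_2} - \alpha W_{i_1k_1} \right) \Delta V_{i_2k_2} - \alpha Z_{i_2k_2i_1k_1} + 2 \tag{24} \]

where $U_{i_2k_2}$ is defined as in Eq. (16), and $Z_{i_2k_2i_1k_1}$ and
$W_{i_1k_1}$ are defined as in Eqs. (19)-(20), except that the
subscripts 1 and 2 are interchanged.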
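The expression for $\Delta E_1$ in Eq. (15) can be checked numerically on a generic Hopfield energy. The sketch below is an illustration under assumptions, not the authors' code: it uses a flat node index instead of the pair $(i_1, k_1)$, a random symmetric weight matrix `T` whose diagonal entries are set to -4, and a constant bias of 4. Under these assumptions the quadratic self-term contributes exactly the $2\Delta V$ correction.

```python
import random

def hopfield_energy(T, I, V):
    """E = -1/2 * sum_ab T[a][b] V[a] V[b] - sum_a I[a] V[a]."""
    n = len(V)
    quad = sum(T[a][b] * V[a] * V[b] for a in range(n) for b in range(n))
    return -0.5 * quad - sum(I[a] * V[a] for a in range(n))

def predicted_change(T, I, V, a, dV):
    """Closed-form energy change when V[a] changes by dV (assumes T[a][a] == -4)."""
    U = sum(T[a][b] * V[b] for b in range(len(V))) + I[a]  # net input before the change
    return -(U - 2 * dV) * dV

rng = random.Random(1)
n = 6
T = [[0.0] * n for _ in range(n)]
for a in range(n):
    T[a][a] = -4.0
    for b in range(a + 1, n):
        T[a][b] = T[b][a] = rng.choice([-1.0, 1.0])  # symmetric weights
I = [4.0] * n
V = [rng.randint(0, 1) for _ in range(n)]

max_error = 0.0
for a in range(n):
    dV = 1 - 2 * V[a]                 # +1 for a 0->1 change, -1 for 1->0
    flipped = V[:a] + [V[a] + dV] + V[a + 1:]
    actual = hopfield_energy(T, I, flipped) - hopfield_energy(T, I, V)
    max_error = max(max_error, abs(actual - predicted_change(T, I, V, a, dV)))
print(max_error)
```

The direct energy difference and the closed form agree for every single-neuron change, which is what allows the update rule to be evaluated from the local quantity $U$ without recomputing the whole energy.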


3.6  Summary  of  the  Updating  Algorithm 

The updating algorithm is summarized in the following
steps.

Step 1) Set the initial state of neurons for layers
$L_1$ and $L_2$:

\[ V^{0}_{i_mk_m} = \begin{cases} 1, & \text{if } |f_{i_m} - f_{k_m}| < d \\ 0, & \text{otherwise} \end{cases} \tag{25} \]

where $d$ is a threshold to determine if features $f_{i_m}$ and
$f_{k_m}$ are compatible, $m = 1, 2$.

Step 2) Randomly select $(i_1, k_1)$ in layer $L_1$.

Step 3) Update the state $V_{i_1k_1}$:

\[ V^{n+1}_{i_1k_1} = \begin{cases} 1, & \text{if } U_{i_1k_1} - \alpha W_{i_2k_2} + \alpha Z_{i_1k_1i_2k_2} > 2; \\ 0, & \text{if } U_{i_1k_1} - \alpha W_{i_2k_2} - \alpha Z_{i_1k_1i_2k_2} < -2. \end{cases} \tag{26} \]

Step 4) Randomly select $(i_2, k_2)$ in layer $L_2$.

Step 5) Update the state $V_{i_2k_2}$:

\[ V^{n+1}_{i_2k_2} = \begin{cases} 1, & \text{if } U_{i_2k_2} - \alpha W_{i_1k_1} + \alpha Z_{i_2k_2i_1k_1} > 2; \\ 0, & \text{if } U_{i_2k_2} - \alpha W_{i_1k_1} - \alpha Z_{i_2k_2i_1k_1} < -2. \end{cases} \tag{27} \]

Step 6) Check for the termination condition. If
it is satisfied, go to step 7); otherwise go to step 2).

Step 7) Output the final states of neurons $V_{i_1k_1}$
and $V_{i_2k_2}$, which will be the final matches between the
model features and the input features in level $L_1$ and
level $L_2$, respectively.
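The two-sided threshold test used in Steps 3 and 5 can be sketched as a small function. The name `update_state` and its argument names are assumptions for illustration, and the values of `U`, `W` and `Z` are taken as given inputs rather than computed from a network. Leaving the state unchanged when neither threshold is crossed is also an assumption of this sketch.

```python
def update_state(current, U, W, Z, alpha):
    """Thresholded update of one neuron.

    current: present state of the neuron (0 or 1).
    U: intra-layer input; W, Z: the two inter-layer terms for this neuron.
    alpha: interconnection parameter weighting the inter-layer terms.
    """
    if U - alpha * W + alpha * Z > 2:
        return 1                      # switching on lowers the total energy
    if U - alpha * W - alpha * Z < -2:
        return 0                      # switching off lowers the total energy
    return current                    # assumed: otherwise keep the state

print(update_state(0, U=3.0, W=1.0, Z=1.0, alpha=0.5))   # 1
print(update_state(1, U=-3.0, W=1.0, Z=1.0, alpha=0.5))  # 0
print(update_state(1, U=0.0, W=0.0, Z=0.0, alpha=0.5))   # 1 (unchanged)
```

The thresholds of 2 and -2 are the values at which the change in total energy for a 0 to 1 or a 1 to 0 transition becomes negative.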

It is well known that the optimal solution is not always
attained for non-convex gradient searches. Two
termination strategies are used in this algorithm. The first
terminates at a local minimum, i.e., whenever the outputs
of all the neurons in the network have converged
to a local minimum in the sense of unity Hamming
distance. This guarantees that (within Hamming distance
unity of the output) there is no other state of
lower energy. The second terminates the algorithm when
the outputs of the neurons are unchanged after a fixed
number of iterations.
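The second termination strategy can be sketched with a small helper that counts how long the state has stayed the same. The class name `StabilityMonitor` and the parameter `patience` are hypothetical names introduced for this illustration.

```python
class StabilityMonitor:
    """Signal termination once the state has not changed for `patience` iterations."""

    def __init__(self, patience):
        self.patience = patience
        self.unchanged = 0
        self.previous = None

    def should_stop(self, state):
        snapshot = tuple(state)
        if snapshot == self.previous:
            self.unchanged += 1       # one more iteration without any change
        else:
            self.unchanged = 0        # state changed: restart the count
            self.previous = snapshot
        return self.unchanged >= self.patience

monitor = StabilityMonitor(patience=3)
history = [[0, 1], [1, 1], [1, 1], [1, 1], [1, 1]]
stops = [monitor.should_stop(s) for s in history]
print(stops)  # [False, False, False, False, True]
```

With random node selection, a state can stay unchanged for a few iterations simply because the selected neurons were already stable, so the count should be large relative to the number of neurons.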

4  Results 

In this section, the merits of the proposed algorithm
are examined using several test objects which are images
of different door keys. Each image is processed
by a QMF filter [2] with 24 taps in order to generate
the multi-resolution images for the test object. In the
next preprocessing step, high curvature points (corners)
of the test object are extracted at each level of
the image pyramid separately.

We investigated two sets of test objects to explore
the performance of our proposed multi-layer network.
The first set is composed of image scenes with a single
object (one key). In the other set, we processed image
scenes that contained multiple occluded objects (overlapping
keys). The scene graph pyramid containing
two feature graphs at two resolution levels for multiple
occluded objects is shown in Fig. 3.

4.1  Single  and  Occluded  Object  Results 

In this investigation, we formed our test objects by
translating and rotating the key models. The model
database contained three keys. The single layer and
the two-layered Hopfield networks were both simulated
and their recognition performance for single
and occluded objects in the input scene was studied.

The state trajectory and final state at termination
depend on the initial state vector and the particular
realization of the random updating sequence by which
the candidate state for updating is selected. Each network
was tested with 17 different realizations of the random
updating sequence. Three test objects were used in our
experiment, thus a total of 51 runs were performed for
each of the single and occluded object cases. In the case
of a single object, the recognition success of the single
layer Hopfield net was 41%; for the two-layered Hopfield
net, the rate of recognition increased to 82%. In
the case of input scenes with occluded objects, the rate
of recognition of the single layer Hopfield net was
only 6%; for the two-layered Hopfield net, the recognition
rate was 31%. The recognition of occluded objects
is more difficult than that of a single object since extra
matching ambiguities are introduced.

Our experiment also showed that the computing
time for the two-layered Hopfield net is 1 to 3
times more than that of the one layer Hopfield net in
the single object case, and 1 to 1.5 times more in the
occluded object case. This is because the number of nodes in
the two-layered Hopfield network is larger than that
of the single layer Hopfield network. These findings
show that, at this additional computational cost, the
two-layered Hopfield network achieves higher recognition
rates than the single layer Hopfield network.

4.2  Energy  Function  Behaviour 

To analyze the differences between the one layer
Hopfield net and the two-layered Hopfield net, we investigated
the behaviour of the energy functions for
both networks. Because the energy function is high
dimensional and non-linear, it is difficult to plot the
shape of the energy function over the states of the
network. Instead, we consider the energy function vs. the
iteration number as the network seeks a stable state.
This provides grounds for comparison between the behaviour
of the energy functions of the two networks. Fig.
4 shows plots of the energy functions vs. the number
of iterations for the single layer Hopfield net as well
as the two-layered Hopfield net, when the input is a
single object.

4.3  Effect  of  Interconnection  Parameter  $\alpha$

The interconnection parameter $\alpha$ scales the inter-layer
energy function $E_c$ relative to the intra-layer energy
functions $E_1$ and $E_2$. It is useful
to consider the effect of this parameter $\alpha$ on the behaviour
of the two-layered Hopfield network. We examined
the effect of this interconnection parameter on
the network performance for $\alpha$ in the range 0 to 1. In
Fig. 5, plots (b) and (c) show the energy functions, for
$\alpha$ close to zero, trapped in local minima which are all
far from the global minimum. For $\alpha$ close to zero, the
two-layered network behaves as two independent single
layer networks without interconnection. For $\alpha \geq 1$,
the inter-energy function dominates the energy function
for layers $L_1$ or $L_2$, and the energy functions $E_1$ or $E_2$
never converge to a stable state, as shown in plot (a)
for $E_1$. When $\alpha$ is between 0 and 1, the energy function
$E_1$ can converge to the global minimum or a local
minimum which is close to the global minimum. Our
results show that $0.3 < \alpha < 0.5$ is a good compromise
for both single and two layer nets.

5  Conclusions 

In this paper we have presented a multi-layered
Hopfield neural network for object recognition. A single
layer Hopfield neural network has significant limitations.
For example, it can get trapped in one of
many local minima of the energy function. Although it
is difficult to fully analyze the dynamics of the energy
function for a multi-layered Hopfield neural network
due to the high dimensionality of the energy function,
experimental results indicate that the energy function of
a multi-layered Hopfield neural network converges to
a local minimum which is often equal to or very close
to the global minimum.

The matching process proposed in this paper is a
concurrent coarse-and-fine strategy. This choice efficiently
utilizes the multi-layered Hopfield network
by making the adjacent layers of the multi-layered
network reinforce each other. The interconnections between
the adjacent layers of the multi-layered Hopfield neural
network are defined on the basis of the compatibility
characteristics of the multi-resolution image pyramid.
This is one approach to defining interconnections between
the layers of the multi-layered Hopfield neural network
so that the adjacent layers reinforce each other. Other kinds of
interconnection compatibility conditions are currently
under investigation.

Figure 1. Architecture of multi-layered Hopfield network.

Figure 2. Inter-connections of neurons for two-layered
Hopfield network.

Figure 3. A scene graph pyramid containing two
feature graphs at two resolution levels for multiple occluded
objects.

Figure 4. Energy behaviour. (a) $E_1$, the energy
function for the layer $L_1$ when using single layer network,
which stalled at a local minimum; (b) $E_1$, the
energy function for the layer $L_1$ when using two-layered
network, which converged to the global minimum;
(c) $\Phi_1$, the total energy at layer $L_1$, $\Phi_1 =
E_1 + \alpha \times E_c$ ($\alpha = 0.5$); (d) $E_c$, the inter-energy; (e)
$E_2$, the energy function for the layer $L_2$ when using
single layer network; (f) $E_2$, the energy function for
the layer $L_2$ when using two-layered network.

Figure 5. Energy function of the layer $L_1$ using two-layered
net for different values of the interconnection
parameter $\alpha$. (a) $\alpha = 1$; (b) $\alpha = 0$; (c) $\alpha = 0.1$; (d)
$\alpha = 0.3$; (e) $\alpha = 0.4$; (f) $\alpha = 0.5$; (g) $\alpha = 0.75$.


